Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM

نویسندگان

  • Longbiao Wang
  • Norihide Kitaoka
  • Seiichi Nakagawa
چکیده

In this paper, we propose a robust speaker recognition method based on position-dependent Cepstral Mean Normalization (CMN) to compensate for the channel distortion depending on the speaker position. In the training stage, the system measures the transmission characteristics according to the speaker positions from some grid points to the microphone in the room and estimates the compensation parameters a priori. In the recognition stage, the system estimates the speaker position and adopts the estimated compensation parameters corresponding to the estimated position, and then the system applies the CMN to the speech and performs speaker recognition. In our past study, we proposed a new text-independent speaker recognition method by combining speaker-specific Gaussian mixture models (GMMs) with syllable-based HMMs adapted to the speakers by MAP [Nakagawa, S., Zhang, W., Takahashi, M., 2004. Text-independent speaker recognition by combining speaker-specific GMM with speaker-adapted syllable-based HMM. Proc. ICASSP-2004 1, 81– 84]. The robustness of this speaker recognition method for the change of the speaking style in close-talking environment was evaluated in (Nakagawa et al., 2004). In this paper, we extend this combination method to distant speaker recognition and integrate this method with the proposed position-dependent CMN. Our experiments showed that the proposed method improved the speaker recognition performance remarkably in a distant environment. 2007 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust distant speaker recognition based on position dependent cepstral mean normalization

In a distant environment, channel distortion may drastically degrade speaker recognition performance. In this paper, we propose a robust speaker recognition method based on position dependent Cepstral Mean Normalization (CMN) to compensate the channel distortion depending on the speaker position. It is shown in [1] that the position dependent CMN is robust for speech recognition in a distant en...

متن کامل

Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM

We present a new text-independent speaker recognition method by combining speaker-specific Gaussian Mixture Model(GMM) with syllable-based HMM adapted by MLLR or MAP. The robustness of this speaker recognition method for speaking style’s change was evaluated. The speaker identification experiment using NTT database which consists of sentences data uttered at three speed modes (normal, fast and ...

متن کامل

Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN

In a distant-talking environment, the length of channel impulse response is longer than the short-term spectral analysis window. Conventional short-term spectrum based Cepstral Mean Normalization (CMN) is therefore, not effective under these conditions. In this paper, we propose a robust speech recognition method by combining a short-term spectrum based CMN with a long-term one. We assume that ...

متن کامل

Robust Speech Recognition in Distant Environment Based on Speaker Position and Speaking Direction Detection

In a practical environment, channel distortion may severly degrade speech recognition performance. In this paper, we propose a robust speech recognition method using real-time Cepstral Mean Normalization (CMN) [1] based on speaker position and speaking direction detection. We first estimate the speaker position in a 3-D space based on the time delay of arrival (TDOA) between distinct microphone...

متن کامل

User-customized Password Speaker Verif Gmm Model

In this paper, we present a new approach towards user-customized password speaker verification combining the advantages of hybrid HMM/ANN systems, usingArtificial Neural Networks (ANN) to estimate emission probabilities of Hidden Markov Models , and Gaussian Mixture Models. In the approach presented here, we indeed exploit the properties of hybrid HMM/ANN systems, usually resulting in high phon...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Speech Communication

دوره 49  شماره 

صفحات  -

تاریخ انتشار 2007